Abstract. Distance function to a compact set plays a central role in
several areas of computational geometry. Methods that rely on it are
robust to the perturbations of the data by the Hausdorff noise, but fail
in the presence of outliers. The recently introduced distance to a
measure offers a solution by extending the distance function framework to
reasoning about the geometry of probability measures, while maintaining
theoretical guarantees about the quality of the inferred information.
A combinatorial explosion hinders working with distance to a measure
as an ordinary (power) distance function. In this paper, we analyze an
approximation scheme that keeps the representation linear in the size
of the input, while maintaining the guarantees on the inference quality
close to those for the exact (but costly) representation.

1. Introduction

The problem of recovering the geometry and topology of compact sets
from finite point samples has seen several important developments in the
previous decade. Homeomorphic surface reconstruction algorithms have been
proposed to deal with surfaces in $\mathbb{R}^3$ sampled without noise [1]
and with moderate Hausdorff (local) noise [11]. In the case of submanifolds
of a higher dimensional Euclidean space [17], or even for more general
compact subsets [4], it is also possible, at least in principle, to compute
the homotopy type from a Hausdorff sampling. If one is only interested in
the homology of the underlying space, the theory of persistent homology [13]
applied to Rips graphs provides an algorithmically tractable way to estimate
the Betti numbers from a finite Hausdorff sampling [8].

All of these constructions share a common feature: they estimate the 
geometry of the underlying space by a union of balls of some radius r centered 
at the data points P. A different way to interpret this union is as the 
$r$-sublevel set of the distance function to $P$, $d_P : x \mapsto \min_{p \in P} \|x - p\|$.
Distance functions capture the geometry of their defining sets, and they are 
stable to Hausdorff perturbations of those sets, making them well-suited for 
reconstruction results. However, they are also extremely sensitive to the 
presence of outliers (i.e. data points that lie far from the underlying set); all 
reconstruction techniques that rely on them fail even in presence of a single 
outlier. 

To counter this problem, Chazal, Cohen-Steiner, and Mérigot [5] developed
a notion of distance function to a probability measure that retains the
properties of the (usual) distance important for geometric inference. Instead
of assuming an underlying compact set that is sampled by the points, they
assume an underlying probability measure $\mu$ from which the point sample
$P$ is drawn. The distance function $d_{\mu,m_0}$ to the measure $\mu$
depends on a mass parameter $m_0 \in (0,1)$. This parameter acts as a
smoothing term: a smaller $m_0$ captures the geometry of the support better,
while a larger $m_0$ leads to better stability at the price of precision.
The crucial feature of the function $d_{\mu,m_0}$ is its stability to the
perturbations of the measure $\mu$ under the Wasserstein distance, defined in
Section 2.2. For instance, the Wasserstein distance between the underlying
measure and the uniform probability measure on the point set $P$ can be small
even if $P$ contains some outliers. When this happens, the stability result
ensures that the distance function $d_{\mathbf{1}_P,m_0}$ to the uniform
probability measure $\mathbf{1}_P$ on $P$ retains the geometric information
contained in the underlying measure $\mu$ and its support.

Computing with distance functions to measures. In this article we
address the computational issues related to this new notion. If $P$ is a
subset of $\mathbb{R}^d$ containing $N$ points, and $m_0 = k/N$, we will
denote the distance function to the uniform measure on $P$ by $d_{P,k}$.
As observed in [5], the value of $d_{P,k}$ at a given point $x$ is easy to
compute: it is the square root of the average squared distance from the
point $x$ to its $k$ nearest neighbors in $P$. However, most inference
methods require a way to represent the function, or more precisely its
sublevel sets, globally. It turns out that the distance function $d_{P,k}$
can be rewritten as a minimum

(1)    $$d_{P,k}^2(x) = \min_{c} \|x - c\|^2 - w_c,$$

where $c$ ranges over the set of barycenters of $k$ points in $P$ (see
Section 3). Computational geometry provides a rich toolbox to represent
sublevel sets of such functions, for example, via weighted
$\alpha$-complexes [12].

The difficulty in applying these methods is that to get an equality in (1)
the minimum number of barycenters to store is the same as the number
of order-$k$ Voronoi sites of $P$, making this representation unusable even
for modest input sizes. The solution that we propose is to construct an
approximation of the distance function $d_{P,k}$, defined by the same
equation as (1), but with $c$ ranging over a smaller subset of barycenters.
In this article, we study the quality of approximation given by a
linear-sized subset: the witnessed barycenters, defined as the barycenters
of any $k$ points in $P$ whose order-$k$ Voronoi cell contains at least one
of the sample points. The algorithmic simplicity of the scheme is appealing:
we only have to find the $k - 1$ nearest neighbors for each input point. We
denote by $d^w_{P,k}$ and call witnessed $k$-distance the function defined by
Equation (1), where $c$ ranges over the witnessed barycenters.

Contributions. Our goal is to give conditions on the point cloud $P$ under
which the witnessed $k$-distance $d^w_{P,k}$ provides a good uniform
approximation of the distance to measure $d_{P,k}$. We first give a general
multiplicative bound on the error produced by this approximation. However,
most of our paper (Sections 4 and 5) analyzes the uniform approximation
error, when $P$ is a set of independent samples from a measure concentrated
near a lower-dimensional subset of the Euclidean space. The following is a
prototypical example for our setting, although the analysis we propose allows
for a wider range of problems. Note that some of the common settings in the
literature either fit directly into this example, or in its logic: the
mixture of Gaussians [10] and off-manifold Gaussian noise in normal
directions [16] are two examples.

(H1) We assume that the "ground truth" is an unknown probability measure
     $\mu$ whose dimension is bounded by a constant $\ell \ll d$.
     Practically, this means that $\mu$ is concentrated on a compact set
     $K \subseteq \mathbb{R}^d$ whose dimension is at most $\ell$, and that
     its mass distribution shouldn't "forget" any part of $K$ (see
     Definition 3). As an example, $\mu$ could be the uniform measure on a
     smooth compact $\ell$-dimensional submanifold $K$, or on a finite union
     of such submanifolds.

This hypothesis ensures that the distance to the measure $\mu$ is close to
the distance to the support $K$ of $\mu$, and lets us recover information
about $K$. Our first result (Witnessed Bound Theorem 2) states that if the
uniform measure on a point cloud $P$ is a good Wasserstein-approximation of
$\mu$, then the witnessed $k$-distance to $P$ provides a good approximation
of the distance to the underlying compact set $K$. The bound we obtain is
only a constant times worse than the bound for the exact $k$-distance.

(H2) The second assumption is that we are not sampling directly from $\mu$,
     but through a noisy channel. We model this by considering that our
     measurements come from a measure $\nu$, which is obtained by adding
     noise to $\mu$. For instance, $\nu$ could be the result of the
     convolution of $\mu$ with a Gaussian distribution
     $\mathcal{N}(0, d^{-1}\sigma^2 I)$ whose variance is $\sigma^2$. More
     generally, $\nu$ can be any measure such that the Wasserstein distance
     from $\mu$ to $\nu$ is at most $\sigma$. This generalization allows, in
     particular, to consider noise models that are not translation-invariant.

(H3) Finally, we suppose that our input data set $P \subseteq \mathbb{R}^d$
     consists of $N$ points drawn independently from the noisy measure
     $\nu$. Denote with $\mathbf{1}_P$ the uniform measure on $P$.

These two hypotheses allow us to control the Wasserstein distance between
$\mu$ and $\mathbf{1}_P$ with high probability. We assume that the point
cloud $P$ is gathered following the three hypotheses above. Our second
result states that the witnessed $k$-distance to $P$ provides a good
approximation of the distance to the compact set $K$ with high probability,
as soon as the amount of noise $\sigma$ is low enough and the number of
points $N$ is large enough.

[Figure 1 panels: (a) Data; (b) Sublevel sets]

Figure 1. (a) 6000 points sampled from a sideways figure 8 (in red), with
circle radii $R_1 = \sqrt{2}$ and $R_2 = \sqrt{9/8}$. The points are sampled
from the uniform measure on the figure-8, convolved with the Gaussian
distribution $\mathcal{N}(0, \sigma^2)$ where $\sigma = .45$. (b)
$r$-sublevel sets of the witnessed (in gray) and exact (additional points in
black) $k$-distances with mass parameter $m_0 = 50/6000$, and $r = .239$.

Approximation Theorem (Theorem 4). Let $P$ be a set of $N$ points
drawn according to the three hypotheses (H1)-(H3), let
$k \in \{1, \dots, N\}$ and $m_0 = k/N$. Then, the error bound

    $$\|d^w_{P,k} - d_K\|_\infty \le 54\, m_0^{-1/2} \sigma + 24\, m_0^{1/\ell} \alpha_\mu^{-1/\ell}$$

holds with probability at least

    $$1 - \gamma_\mu \exp\left(-\beta_\mu N \max(\sigma^{2+2\ell}, \sigma^4) - \ell \ln(\sigma)\right),$$

where the constants $\beta_\mu$ and $\gamma_\mu$ depend only on $\mu$.

We illustrate the utility of the bound with an example and a topological
inference statement in our final Section 6.

Outline. The relevant background appears in Section 2. We present our
approximation scheme together with a general bound of its quality in
Section 3. We analyze its approximation quality for measures concentrated on
low-dimensional subsets of the Euclidean space in Section 4. The convergence
of the uniform measure on a point cloud sampled from a measure of low
complexity appears in Section 5 and leads to our main result.

2. Background 
We begin by reviewing the relevant background. 

2.1. Measure. Let us briefly recap the few concepts of measure theory that
we use. A non-negative measure $\mu$ on the space $\mathbb{R}^d$ is a mass
distribution. Mathematically, it is defined as a function that maps every
(Borel) subset $B$ of $\mathbb{R}^d$ to a non-negative number $\mu(B)$,
which is additive in the sense that
$\mu\left(\bigcup_{i \in \mathbb{N}} B_i\right) = \sum_i \mu(B_i)$ whenever
$(B_i)$ is a countable family of disjoint (Borel) subsets of $\mathbb{R}^d$.
The total mass of a measure $\mu$ is $\mathrm{mass}(\mu) = \mu(\mathbb{R}^d)$.
A measure $\mu$ is called a probability measure if its total mass is one. The
support of a probability measure $\mu$, denoted by $\mathrm{spt}(\mu)$, is
the smallest closed set whose complement has zero measure. The expectation
or mean of $\mu$ is the point
$E(\mu) = \int_{\mathbb{R}^d} x \, d\mu(x)$; the variance of $\mu$ is the
number

    $$\sigma_\mu^2 = \int_{\mathbb{R}^d} \|x - E(\mu)\|^2 \, d\mu(x).$$

Although the results we present are often more general, the typical
probability measures we have in mind are of two kinds: (i) the uniform
probability measure defined by the volume form of a lower-dimensional
submanifold of the ambient space and (ii) discrete probability measures that
are obtained through noisy sampling of probability measures of the previous
kind. For any finite set $P$ with $N$ points, denote by $\mathbf{1}_P$ the
uniform measure supported on $P$, i.e. the sum of Dirac masses centered at
$p \in P$ with weight $1/N$.

2.2. Wasserstein distance. A natural way to quantify the distance between
two measures is the Wasserstein distance. This distance measures the
$L^2$-cost of transporting the mass of the first measure onto the second one.
A general study of this notion and its relation to the problem of optimal
transport appears in [18]. We first give the general definition and then
explain its interpretation when one of the two measures has finite support.

A transport plan between two measures $\mu$ and $\nu$ with the same total
mass is a measure $\pi$ on the product space
$\mathbb{R}^d \times \mathbb{R}^d$ such that for all subsets $A, B$ of
$\mathbb{R}^d$, $\pi(A \times \mathbb{R}^d) = \mu(A)$ and
$\pi(\mathbb{R}^d \times B) = \nu(B)$. Intuitively, $\pi(A \times B)$
represents the amount of mass of $\mu$ contained in $A$ that will be
transported to $B$ by $\pi$. The cost of this transport plan is given by

    $$c(\pi) := \left( \int_{\mathbb{R}^d \times \mathbb{R}^d} \|x - y\|^2 \, d\pi(x, y) \right)^{1/2}.$$

Finally, the Wasserstein distance between $\mu$ and $\nu$ is the minimum
cost of a transport plan between these measures.

Consider the special case where the measure $\nu$ is supported on a finite
set $P$. This means that $\nu$ can be written as
$\sum_{p \in P} \alpha_p \delta_p$, where $\delta_p$ is the unit Dirac mass
at $p$. Moreover, $\sum_p \alpha_p$ must equal the total mass of $\mu$. A
transport plan $\pi$ between $\mu$ and $\nu$ corresponds to a decomposition
of $\mu$ into a sum of positive measures $\sum_{p \in P} \mu_p$ such that
$\mathrm{mass}(\mu_p) = \alpha_p$. The cost of the plan defined by this
decomposition is then

    $$c(\pi) = \left( \sum_{p \in P} \int_{\mathbb{R}^d} \|x - p\|^2 \, d\mu_p(x) \right)^{1/2}.$$

Wasserstein noise. Two properties of the Wasserstein distances are worth
mentioning for our purpose. Together, they show that the Wasserstein noise
and sampling model generalize the commonly used empirical sampling with
Gaussian noise model:

• Consider a probability measure $\mu$ and $f : \mathbb{R}^d \to \mathbb{R}$
  the density of a probability distribution centered at the origin, and
  denote by $\nu$ the result of the convolution of $\mu$ by $f$. Then, the
  Wasserstein distance between $\mu$ and $\nu$ is at most $\sigma$, where
  $\sigma^2 := \int_{\mathbb{R}^d} \|x\|^2 f(x) \, dx$ is the variance of
  the probability distribution defined by $f$.

• Let $P$ denote a set of $N$ points drawn independently from a given
  measure $\nu$. Then, the Wasserstein distance
  $W_2(\nu, \mathbf{1}_P)$ between $\nu$ and the uniform probability measure
  on $P$ converges to zero as $N$ grows to infinity with high probability.
  Examples of such asymptotic convergence results are common in statistics,
  e.g. [3] and references therein. In Proposition 3 below, we give a
  quantitative non-asymptotic result assuming that $\nu$ is low-dimensional
  (H1).

Using the notation introduced in the two items above, one has

    $$\limsup_{N \to +\infty} W_2(\mu, \mathbf{1}_P) \le \sigma$$

with high probability as the number of points grows to infinity. A more
quantitative version of this statement can be found in Corollary 1.

2.3. Distance-to-measure and $k$-distance. In [5], the authors introduce
a distance to a probability measure as a way to infer the geometry and
topology of this measure in the same way the geometry and topology of a
set is inferred from its distance function. Given a probability measure
$\mu$ and a mass parameter $m_0 \in (0, 1)$, they define a distance function
$d_{\mu,m_0}$ which captures the properties of the usual distance function
to a compact set that are used for geometric inference.

Definition 1. For any point $x$ in $\mathbb{R}^d$, let $\delta_{\mu,m}(x)$
be the radius of the smallest ball centered at $x$ that contains a mass at
least $m$ of the measure $\mu$. The distance to the measure $\mu$ with
parameter $m_0$ is defined by

    $$d_{\mu,m_0}(x) = \left( \frac{1}{m_0} \int_0^{m_0} \delta_{\mu,m}(x)^2 \, dm \right)^{1/2}.$$

Given a point cloud $P$ containing $N$ points, the measure of interest is
the uniform measure $\mathbf{1}_P$ on $P$. When $m_0$ is a fraction $k/N$ of
the number of points (where $k$ is an integer), we call $k$-distance and
denote by $d_{P,k}$ the distance to the measure $d_{\mathbf{1}_P,m_0}$. The
value of $d_{P,k}$ at a query point $x$ is given by

    $$d_{P,k}^2(x) = \frac{1}{k} \sum_{p \in \mathrm{NN}^k_P(x)} \|x - p\|^2,$$

where $\mathrm{NN}^k_P(x) \subseteq P$ denotes the $k$ nearest neighbors in
$P$ to the point $x \in \mathbb{R}^d$. (Note that while the $k$-th nearest
neighbor itself might be ambiguous, on the boundary of an order-$k$ Voronoi
cell, the distance to the $k$-th nearest neighbor is always well defined,
and so is $d_{P,k}$.)
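
The pointwise formula above translates directly into code. The following is a minimal sketch, assuming points are given as tuples of floats and using a brute-force sort rather than a spatial index; the function name `k_distance` and the toy point set are illustrative choices, not part of the original text.

```python
import math


def k_distance(points, x, k):
    """k-distance from x to the point cloud: root mean squared distance
    from x to its k nearest neighbors among `points` (brute force)."""
    squared = sorted(sum((a - b) ** 2 for a, b in zip(p, x)) for p in points)
    return math.sqrt(sum(squared[:k]) / k)


# Three clustered points and one far-away outlier.
P = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]

at_cluster = k_distance(P, (0.0, 0.0), 2)    # small: two close neighbors
at_outlier = k_distance(P, (10.0, 10.0), 2)  # large although (10, 10) is in P
```

With $k = 1$ the function reduces to the ordinary distance to $P$, which vanishes at the outlier; with $k = 2$ the outlier no longer creates a zero of the function, which is the robustness the text describes.
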

The most important property of the distance function $d_{\mu,m_0}$ is its
stability, for a fixed $m_0$, under perturbations of the underlying measure
$\mu$. This property provides a bridge between the underlying (continuous)
$\mu$ and the discrete measures $\mathbf{1}_P$. According to
[5, Theorem 3.5], for any two probability measures $\mu$ and $\nu$ on
$\mathbb{R}^d$,

(2)    $$\|d_{\mu,m_0} - d_{\nu,m_0}\|_\infty \le m_0^{-1/2} \, W_2(\mu, \nu),$$

where $W_2(\mu, \nu)$ denotes the Wasserstein distance between the two
measures. The bound in this inequality depends on the choice of $m_0$, which
acts as a smoothing parameter.



3. Witnessed $k$-Distance

In this section, we describe a simple scheme for approximating the distance
to a uniform measure, together with a general error bound. The main
contribution of our work, presented in Section 4, is the analysis of the
quality of approximation given by this scheme when the input points come
from a measure concentrated on a lower-dimensional subset of the Euclidean
space.

3.1. $k$-Distance as a Power Distance. Given a set of points
$U = \{u_1, \dots, u_n\}$ in $\mathbb{R}^d$ with weights $w_u$ for every
$u \in U$, we call power distance to $U$ the function $\mathrm{pow}_U$
obtained as the lower envelope of all the functions
$x \mapsto \|u - x\|^2 - w_u$, where $u$ ranges over $U$. By Proposition 3.1
in [5], we can express the square of any distance to a measure as a power
distance with non-positive weights. The following proposition recalls this
property of the $k$-distance $d_{P,k}$.

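Proposition 1. For any $P \subseteq \mathbb{R}^d$, denote by
$\mathrm{Bary}^k(P)$ the set of barycenters of any subset of $k$ points in
$P$. Then

(3)    $$d_{P,k}^2(x) = \min \left\{ \|x - c\|^2 - w_c \,;\; c \in \mathrm{Bary}^k(P) \right\},$$

where the weight of a barycenter $c = \frac{1}{k} \sum_i p_i$ is given by
$w_c := -\frac{1}{k} \sum_i \|c - p_i\|^2$.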



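Proof. For any subset $C$ of $k$ points in $P$, define

    $$\delta_C^2(x) := \frac{1}{k} \sum_{p \in C} \|x - p\|^2.$$

Denoting by $c$ the barycenter of the points in $C$, an easy computation
shows

    $$\delta_C^2(x) = \frac{1}{k} \sum_{p \in C} \|x - p\|^2 = \|x - c\|^2 - w_c,$$

where the weight is given by $w_c = -\frac{1}{k} \sum_{p \in C} \|c - p\|^2$.
The proposition follows from the definition of the $k$-distance. □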
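
For completeness, the "easy computation" in the proof can be spelled out as follows. This is a standard expansion around the barycenter, written here with the sums made explicit; $C$ is a set of $k$ points and $c = \frac{1}{k}\sum_{p \in C} p$ its barycenter.

```latex
\begin{align*}
\frac{1}{k}\sum_{p \in C} \|x - p\|^2
  &= \frac{1}{k}\sum_{p \in C} \|(x - c) + (c - p)\|^2 \\
  &= \|x - c\|^2
     + \frac{2}{k}\Big\langle x - c,\ \sum_{p \in C} (c - p) \Big\rangle
     + \frac{1}{k}\sum_{p \in C} \|c - p\|^2 \\
  &= \|x - c\|^2 + \frac{1}{k}\sum_{p \in C} \|c - p\|^2 ,
\end{align*}
% The cross term vanishes because \sum_{p \in C} (c - p) = k c - \sum_{p \in C} p = 0.
```

The last term is exactly the negated weight of the barycenter, which is why the weights in the power-distance representation are non-positive.
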

In other words, the square of the $k$-distance function to $P$ coincides
exactly with the power distance to the set of barycenters
$\mathrm{Bary}^k(P)$ with the weights defined above. From this expression, it
follows that the sublevel sets of the $k$-distance $d_{P,k}$ are finite
unions of balls,

    $$d_{P,k}^{-1}([0, \rho]) = \bigcup_{c \in \mathrm{Bary}^k(P)} B\left(c, (\rho^2 + w_c)^{1/2}\right).$$

Therefore, ignoring the complexity issues, it is possible to compute the
homotopy type of this sublevel set by considering the weighted alpha-shape of
$\mathrm{Bary}^k(P)$ (introduced in [12]), which is a subcomplex of the
regular triangulation of the set of weighted barycenters.

From the proof of Proposition 1 we also see that the only barycenters
that actually play a role in (3) are the barycenters of $k$ points of $P$
whose order-$k$ Voronoi cell is not empty. However, the dependence on the
number of non-empty order-$k$ Voronoi cells makes computation intractable
even for moderately sized point clouds in the Euclidean space.

One way to avoid this difficulty is to replace the $k$-distance to $P$ by an
approximate $k$-distance, defined as in Equation (3), but where the minimum
is taken over a smaller set of barycenters. The question is then: given a
point set $P$, can we replace the set of barycenters $\mathrm{Bary}^k(P)$ in
the definition of $k$-distance by a small subset $B$ while controlling the
approximation error

    $$\left\| \mathrm{pow}_B^{1/2} - d_{P,k} \right\|_\infty \; ?$$

This approach is especially attractive since many geometric and topologi- 
cal inference methods using distance functions to compact sets or to measures 
continue to hold when one of the distance functions is replaced by a good 
approximation in the class of power distances. 

3.2. Approximating by witnessed $k$-distance. In order to approach this
question, we consider a subset of the supporting barycenters suggested by
the input data which we call witnessed barycenters. The answer to the
question is then essentially positive when the input point cloud $P$
satisfies the hypotheses (H1)-(H3).

Definition 2. For every point $x$ in $P$, the barycenter of $x$ and its
$(k - 1)$ nearest neighbors in $P$ is called a witnessed $k$-barycenter. Let
$\mathrm{Bary}^k_w(P)$ be the set of all such barycenters. We get one
witnessed barycenter for every point $x$ of the sampled point set, and
define the witnessed $k$-distance,

    $$d^w_{P,k}(x) = \min \left\{ \|x - c\|^2 - w_c \,;\; c \in \mathrm{Bary}^k_w(P) \right\}^{1/2}.$$

Computing the set of all witnessed barycenters of a point set $P$ only
requires finding the $k - 1$ nearest neighbors of every point in $P$. This
search problem has a long history in computational geometry [7, 14], and now
has several practical implementations.
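
The construction in Definition 2 can be sketched in a few lines. The code below is an illustration under simplifying assumptions: it uses a brute-force neighbor search (quadratic in the number of points) where a practical implementation would use a nearest-neighbor data structure, and the names `witnessed_barycenters` and `witnessed_k_distance` are chosen for this example only.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def witnessed_barycenters(points, k):
    """Return one (barycenter, weight) pair per input point.

    For each x, the barycenter of x and its k - 1 nearest neighbors is
    computed, with weight w_c = -(1/k) * sum ||c - p||^2 as in Proposition 1.
    """
    result = []
    for x in points:
        # x is at distance 0 from itself, so the k closest points to x
        # are x together with its k - 1 nearest neighbors.
        nbrs = sorted(points, key=lambda p: sq_dist(p, x))[:k]
        c = tuple(sum(p[i] for p in nbrs) / k for i in range(len(x)))
        w = -sum(sq_dist(c, p) for p in nbrs) / k
        result.append((c, w))
    return result


def witnessed_k_distance(barycenters, x):
    """Square root of the power distance from x to the weighted barycenters."""
    return math.sqrt(min(sq_dist(x, c) - w for c, w in barycenters))


P = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
bary = witnessed_barycenters(P, k=2)
value = witnessed_k_distance(bary, (0.0, 0.0))
```

The list `bary` has exactly one entry per input point, which is the linear-size representation the section refers to; the resulting weighted points can then be handed to any tool that works with power distances.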

General error bound. Because the distance functions we consider are
defined by minima, and $\mathrm{Bary}^k_w(P)$ is a subset of
$\mathrm{Bary}^k(P)$, the witnessed $k$-distance is never smaller than the
exact $k$-distance. In the lemma below, we give a general multiplicative
upper bound. This lemma does not assume any specific property for the input
point set $P$. However, even such a coarse bound can be used to estimate
Betti numbers of sublevel sets of $d_{P,k}$, using arguments similar to
those in [6].

Lemma 1 (General Bound). For any finite point set
$P \subseteq \mathbb{R}^d$ and $0 < k < |P|$, one has

    $$d_{P,k} \le d^w_{P,k} \le (2 + \sqrt{2}) \, d_{P,k}.$$

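Proof. Let $y \in \mathbb{R}^d$ be a point, and $p$ the barycenter
associated to a cell that contains $y$. This translates into
$d_{P,k}(y) = d_p(y)$, where $d_p(y) := (\|y - p\|^2 - w_p)^{1/2}$. In
particular, $\|p - y\| \le d_{P,k}(y)$ and $\sqrt{-w_p} \le d_{P,k}(y)$.

Let us find a witnessed barycenter $q$ that is close to $p$. We know that $p$
is the barycenter of $k$ points $x_1, \dots, x_k$, and that
$-w_p = \frac{1}{k} \sum_{i=1}^{k} \|x_i - p\|^2$. Consequently, there
should exist an $x_i$ such that $\|x_i - p\| \le \sqrt{-w_p}$; call this
point $x$. Let $q$ be the barycenter witnessed by $x$. Then,

    $$d^w_{P,k}(y) \le d_q(y) \le d_q(x) + \|x - y\|$$
    $$\le d_p(x) + \|x - p\| + \|p - y\|.$$

(The second inequality holds because $d_q$ is 1-Lipschitz, and the third
because $q$ is the barycenter of the $k$ nearest neighbors of $x$, so that
$d_q(x) = d_{P,k}(x) \le d_p(x)$.) Combining the inequality

    $$d_p(x) = (\|x - p\|^2 - w_p)^{1/2} \le \sqrt{2} \sqrt{-w_p}$$

together with $\|x - p\| \le \sqrt{-w_p}$, we get

    $$d^w_{P,k}(y) \le (1 + \sqrt{2}) \sqrt{-w_p} + \|p - y\|$$
    $$\le (2 + \sqrt{2}) \, d_{P,k}(y). \qquad □$$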
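
The two inequalities of the General Bound lemma can be checked numerically on a small random instance. The sketch below is only a sanity check under illustrative assumptions (a fixed seed, 40 random points in the plane, $k = 5$, brute-force neighbor search); the function names are chosen for this example.

```python
import math
import random


def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


def exact_k_distance(points, x, k):
    """Root mean squared distance from x to its k nearest neighbors."""
    squared = sorted(sq_dist(p, x) for p in points)
    return math.sqrt(sum(squared[:k]) / k)


def witnessed_k_distance(points, x, k):
    """Minimum over witnessed barycenters of the power distance, square-rooted."""
    best = math.inf
    for q in points:
        nbrs = sorted(points, key=lambda p: sq_dist(p, q))[:k]
        c = tuple(sum(p[i] for p in nbrs) / k for i in range(len(q)))
        w = -sum(sq_dist(c, p) for p in nbrs) / k
        best = min(best, sq_dist(x, c) - w)
    return math.sqrt(best)


rng = random.Random(0)
P = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(40)]
k = 5

ratios = []
for _ in range(200):
    y = (rng.uniform(-3, 3), rng.uniform(-3, 3))
    ratios.append(witnessed_k_distance(P, y, k) / exact_k_distance(P, y, k))

lowest, highest = min(ratios), max(ratios)
```

By the lemma, `lowest` cannot fall below 1 and `highest` cannot exceed $2 + \sqrt{2}$ (up to floating-point rounding); on well-sampled inputs one would expect the observed ratios to sit far closer to 1 than the worst-case constant.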

4. Approximation Quality 

Let us recall briefly our hypotheses (H1)-(H3). There is an ideal,
well-conditioned measure $\mu$ on $\mathbb{R}^d$ supported on an unknown
compact set $K$. We also have a noisy version of $\mu$, that is another
measure $\nu$ with $W_2(\mu, \nu) \le \sigma$, and we suppose that our data
set $P$ consists of $N$ points independently sampled from $\nu$. In this
section we give conditions under which the witnessed $k$-distance to $P$
provides a good approximation of the distance to the underlying set $K$.

4.1. Dimension of a measure. First, we make precise the main assumption
(H1) on the underlying measure $\mu$, which we use to bound the
approximation error made when replacing the exact by the witnessed
$k$-distance. We require $\mu$ to be low dimensional in the following sense.

Definition 3. A measure $\mu$ on $\mathbb{R}^d$ is said to have dimension at
most $\ell$, which we denote by $\dim \mu \le \ell$, if there is a positive
constant $\alpha_\mu$ such that the amount of mass contained in the ball
$B(p, r)$ is at least $\alpha_\mu r^\ell$, for every point $p$ in the
support of $\mu$ and every $r$ smaller than the diameter of this support.

The important assumption here is that the lower bound
$\mu(B(p, r)) \ge \alpha r^\ell$ should be true for some positive constant
$\alpha$ and for $r$ smaller than a given constant $R$. The choice of
$R = \mathrm{diam}(\mathrm{spt}(\mu))$ provides a normalization of the
constant $\alpha_\mu$ and slightly simplifies the statements of the results.
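
As a concrete instance of Definition 3, consider the following worked example (an illustration added here, not taken from the original text): the length measure on a unit segment has dimension at most 1 with constant 1.

```latex
% Let K = [0,1] \times \{0\}^{d-1} \subset \mathbb{R}^d and let \mu be the
% length (uniform probability) measure on K, so that diam(spt \mu) = 1.
% Identify a point p \in K with its first coordinate p \in [0,1].
% For 0 < r \le 1:
\begin{align*}
\mu\big(B(p, r)\big)
  &= \min(1,\, p + r) - \max(0,\, p - r) \\
  &\ge r ,
\end{align*}
% since the interval [\max(0, p-r), \min(1, p+r)] has length 2r when it is
% not clipped, at least r when it is clipped on one side, and 1 \ge r when
% it is clipped on both sides.
% Hence \dim \mu \le 1 with \alpha_\mu = 1.
```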

Let $M$ be an $\ell$-dimensional compact submanifold of $\mathbb{R}^d$, and
$f : M \to \mathbb{R}$ a positive weight function on $M$ with values bounded
away from zero and infinity. Then, the dimension of the volume measure on
$M$ weighted by the function $f$ is at most $\ell$. A quantitative statement
can be obtained using the Bishop-Günther comparison theorem; the bound
depends on the maximum absolute sectional curvature of the manifold $M$ (see
e.g. Proposition 4.9 in [5]). Note that the positive lower bound on the
density is really necessary. For instance, the dimension of the standard
Gaussian distribution $\mathcal{N}(0, 1)$ on the real line is not bounded by
1, nor by any positive constant. (This fact follows since the density of
this distribution decreases to zero faster than any polynomial as one moves
away from the origin.)

It is easy to see that if $m$ measures $\mu_1, \dots, \mu_m$ have dimension
at most $\ell$, then so does their sum. Consequently, if $(M_j)$ is a finite
family of compact submanifolds of $\mathbb{R}^d$ with dimensions $(d_j)$,
and $\mu_j$ is the volume measure on $M_j$ weighted by a function bounded
away from zero and infinity, the dimension of the sum
$\mu = \sum_{j=1}^{m} \mu_j$ is at most $\max_j d_j$.

4.2. Bounds. In the remainder of this section, we bound the error between
the witnessed $k$-distance $d^w_{P,k}$ and the (ordinary) distance $d_K$ to
the compact set $K$. We start from a proposition from [5] that bounds the
error between the exact $k$-distance $d_{P,k}$ and $d_K$:

Theorem 1 (Exact Bound). Let $\mu$ denote a probability measure with
dimension at most $\ell$, and supported on a set $K$. Consider the uniform
measure $\mathbf{1}_P$ on a point cloud $P$, and set $m_0 = k/|P|$. Then

    $$\|d_{P,k} - d_K\|_\infty \le m_0^{-1/2} \, W_2(\mu, \mathbf{1}_P) + \alpha_\mu^{-1/\ell} m_0^{1/\ell}.$$

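Proof. Recall that $d_{P,k} = d_{\mathbf{1}_P,m_0}$. Using the triangle
inequality and Equation (2), one has

    $$\|d_{\mathbf{1}_P,m_0} - d_K\|_\infty \le \|d_{\mu,m_0} - d_{\mathbf{1}_P,m_0}\|_\infty + \|d_{\mu,m_0} - d_K\|_\infty$$
    $$\le m_0^{-1/2} \, W_2(\mu, \mathbf{1}_P) + \|d_{\mu,m_0} - d_K\|_\infty.$$

Then, from Lemma 4.7 in [5],
$\|d_{\mu,m_0} - d_K\|_\infty \le \alpha_\mu^{-1/\ell} m_0^{1/\ell}$, and the
claim follows. □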

In the main theorem of this section, the exact $k$-distance in the above
bound is replaced by the witnessed $k$-distance.

Theorem 2 (Witnessed Bound). Let $\mu$ be a probability measure satisfying
the dimension assumption and let $K$ be its support. Consider the uniform
measure $\mathbf{1}_P$ on a point cloud $P$, and set $m_0 = k/|P|$. Then,

    $$\|d^w_{P,k} - d_K\|_\infty \le 6\, m_0^{-1/2} \, W_2(\mu, \mathbf{1}_P) + 24\, m_0^{1/\ell} \alpha_\mu^{-1/\ell}.$$

Observe that the error term given by this theorem is a constant factor
times the bound in the previous theorem. Before proceeding with the proof,
we prove an auxiliary lemma, which emphasizes that a measure $\nu$, close to
a measure $\mu$ satisfying an upper dimension bound (as in Definition 3),
remains concentrated around the support of $\mu$.
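
To get a feel for how the two bounds compare and how they depend on the mass parameter, the following sketch simply evaluates the right-hand sides, read here as $m_0^{-1/2} W_2 + \alpha_\mu^{-1/\ell} m_0^{1/\ell}$ for the Exact Bound and $6\, m_0^{-1/2} W_2 + 24\, m_0^{1/\ell} \alpha_\mu^{-1/\ell}$ for the Witnessed Bound. The function names and the numerical values ($W_2 = 0.05$, $\alpha_\mu = 1$, $\ell = 1$) are assumptions chosen purely for illustration.

```python
def exact_bound(m0, w2, alpha, ell):
    """Right-hand side of the Exact Bound (Theorem 1)."""
    return m0 ** -0.5 * w2 + alpha ** (-1.0 / ell) * m0 ** (1.0 / ell)


def witnessed_bound(m0, w2, alpha, ell):
    """Right-hand side of the Witnessed Bound (Theorem 2)."""
    return 6 * m0 ** -0.5 * w2 + 24 * m0 ** (1.0 / ell) * alpha ** (-1.0 / ell)


# Hypothetical values: Wasserstein error 0.05, alpha_mu = 1, dimension 1.
table = [
    (m0, exact_bound(m0, 0.05, 1.0, 1), witnessed_bound(m0, 0.05, 1.0, 1))
    for m0 in (0.001, 0.01, 0.1)
]
for m0, exact, witnessed in table:
    print(f"m0={m0:<6} exact<={exact:.3f} witnessed<={witnessed:.3f}")
```

The first term of each bound shrinks as $m_0$ grows while the second term grows, which reflects the smoothing role of the mass parameter; the witnessed bound always lies between 6 and 24 times the exact one.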

Lemma 2 (Concentration). Let $\mu$ be a probability measure satisfying the
dimension assumption, and $\nu$ be another probability measure. Let $m_0$ be
a mass parameter. Then, for every point $p$ in the support of $\mu$,
$\nu(B(p, \eta)) \ge m_0$, where

    $$\eta = m_0^{-1/2} \, W_2(\mu, \nu) + 4\, m_0^{1/2 + 1/\ell} \alpha_\mu^{-1/\ell}.$$

Proof. Let $\pi$ be an optimal transport plan between $\nu$ and $\mu$. For a
fixed point $p$ in the support of $\mu$, let $r$ be the smallest radius such
that $B(p, r)$ contains at least $2m_0$ of mass of $\mu$. Consider now a
submeasure $\mu'$ of $\mu$ of mass exactly $2m_0$ and whose support is
contained in the ball $B(p, r)$. This measure is obtained by transporting a
submeasure $\nu'$ of $\nu$ by the optimal transport plan $\pi$. Our goal is
to determine for what choice of $\eta$ the ball $B(p, \eta)$ contains a
$\nu'$-mass (and, therefore, a $\nu$-mass) of at least $m_0$. We make use of
Chebyshev's inequality for $\nu'$ to bound the mass of $\nu'$ outside of the
ball $B(p, \eta)$:

(4)    $$\nu'(\mathbb{R}^d \setminus B(p, \eta)) = \nu'(\{x \in \mathbb{R}^d \,;\, \|x - p\| \ge \eta\}) \le \frac{1}{\eta^2} \int \|x - p\|^2 \, d\nu'.$$

Observe that the integral on the right-hand side of this inequality is
exactly the squared Wasserstein distance between $\nu'$ and the Dirac mass
$2m_0 \delta_p$. We bound it using the triangle inequality for the
Wasserstein distance:

(5)    $$\int \|x - p\|^2 \, d\nu' = W_2^2(\nu', 2m_0 \delta_p)$$
       $$\le \left( W_2(\mu', \nu') + W_2(\mu', 2m_0 \delta_p) \right)^2$$
       $$\le \left( W_2(\mu, \nu) + 2m_0 r \right)^2.$$

Combining equations (4) and (5), we get:

    $$\nu(B(p, \eta)) \ge \nu'(B(p, \eta)) \ge \nu'(\mathbb{R}^d) - \nu'(\mathbb{R}^d \setminus B(p, \eta))$$
    $$\ge 2m_0 - \frac{\left( W_2(\mu, \nu) + 2m_0 r \right)^2}{\eta^2}.$$

By the lower bound on the dimension of $\mu$, and the definition of the
radius $r$, one has $r \le (2m_0 / \alpha_\mu)^{1/\ell}$. Hence, the ball
$B(p, \eta)$ contains a mass of at least $m_0$ as soon as

    $$\frac{\left( W_2(\mu, \nu) + \alpha_\mu^{-1/\ell} \, 2^{1 + 1/\ell} \, m_0^{1 + 1/\ell} \right)^2}{\eta^2} \le m_0.$$

This will be true, in particular, if $\eta$ is larger than

    $$W_2(\mu, \nu) \, m_0^{-1/2} + 4\, \alpha_\mu^{-1/\ell} \, m_0^{1/2 + 1/\ell}. \qquad □$$

Proof of the Witnessed Bound Theorem. Since the witnessed $k$-distance is a
minimum over fewer barycenters, it is larger than the real $k$-distance.
Using this fact and the Exact Bound Theorem one gets the lower bound:

    $$d^w_{P,k} \ge d_{P,k} \ge d_K - \left( m_0^{-1/2} \, W_2(\mu, \mathbf{1}_P) + \alpha_\mu^{-1/\ell} m_0^{1/\ell} \right).$$

For the upper bound, if we set $\eta$ as in Lemma 2, for every point $p$ in
$K$, the ball $B(p, \eta)$ contains at least $k$ points in $P$. Consider one
of these points $x_1$; its $(k - 1)$ nearest neighbors $x_2, \dots, x_k$ in
$P$ cannot be at a distance greater than $2\eta$ from $x_1$. Hence, the
points $x_1, \dots, x_k$ belong to the ball $B(p, 3\eta)$ and so does their
barycenter. This shows that the set $W$ of witnessed barycenters, obtained
by this construction, is a $3\eta$-covering of $K$, that is
$d_W \le d_K + 3\eta$. Since the weight of any barycenter in $W$ is at most
$3\eta$, we get $d^w_{P,k} \le d_W + 3\eta$. To sum up,

    $$d^w_{P,k} \le d_W + 3\eta \le d_K + 6\eta.$$

Replacing $\eta$ by its value from the Concentration Lemma concludes the
proof. □

5. Convergence under Empirical Sampling 

One term remains moot in the bound in Theorem 2, namely the Wasserstein
distance $W_2(\mu, \mathbf{1}_P)$. In this section, we analyze its
convergence. The rate depends on the complexity of the measure $\mu$,
defined below. The moral of this section is that if a measure can be well
approximated with few points, then it is also well approximated by random
sampling.

Definition 4. The complexity of a probability measure $\mu$ at a scale
$\varepsilon > 0$ is the minimum cardinality of a finitely supported
probability measure $\nu$ which $\varepsilon$-approximates $\mu$ in the
Wasserstein sense, i.e. such that $W_2(\mu, \nu) \le \varepsilon$. We denote
this number by $\mathcal{N}_\mu(\varepsilon)$.

Observe that this notion is very close to the $\varepsilon$-covering number
of a compact set $K$, denoted by $\mathcal{N}_K(\varepsilon)$, which counts
the minimum number of balls of radius $\varepsilon$ needed to cover $K$.
It's worth noting that if measures $\mu$ and $\nu$ are close (as are the
measure $\mu$ and its noisy approximation $\nu$ in the previous section) and
$\mu$ has low complexity, then so does the measure $\nu$. The following
lemma shows that measures satisfying the dimension assumption have low
complexity. Its proof follows from a classical covering argument, that can
be found e.g. in Proposition 4.1 of [15].

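Lemma 3 (Dimension-Complexity). Let $K$ be the support of a measure $\mu$
with $\dim \mu \le \ell$. Then,

(i) for every positive $\varepsilon$,
    $\mathcal{N}_K(\varepsilon) \le \alpha_\mu / \varepsilon^\ell$. Said
    otherwise, the upper box-counting dimension of $K$ is bounded:

    $$\dim(K) := \limsup_{\varepsilon \to 0} \log(\mathcal{N}_K(\varepsilon)) / \log(1/\varepsilon) \le \ell.$$

(ii) for every positive $\varepsilon$,
    $\mathcal{N}_\mu(\varepsilon) \le \alpha_\mu 5^\ell / \varepsilon^\ell$.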

Theorem 3 (Convergence). Let $\mu$ be a probability measure on
$\mathbb{R}^d$ whose support has diameter at most $D$, and let $P$ be a set
of $N$ points independently drawn from the measure $\mu$. Then, for
$\varepsilon > 0$,

    $$\mathbb{P}\left( W_2(\mathbf{1}_P, \mu) \le 4\varepsilon \right) \ge 1 - \mathcal{N}_\mu(\varepsilon) \exp\left( -2N\varepsilon^2 / (D \, \mathcal{N}_\mu(\varepsilon))^2 \right) - \exp\left( -2N\varepsilon^4 / D^2 \right).$$

Proof. Let $n$ be a fixed integer, and $\varepsilon$ be the minimum
Wasserstein distance between $\mu$ and a measure $\bar\mu$ supported on (at
most) $n$ points. Let $S$ be the support of the optimal measure $\bar\mu$,
so that $\bar\mu$ can be decomposed as $\sum_{s \in S} \alpha_s \delta_s$
($\alpha_s \ge 0$). Let $\pi$ be an optimal transport plan between $\mu$ and
$\bar\mu$; this is equivalent to finding a decomposition of $\mu$ as a sum
of $n$ non-negative measures $(\pi_s)_{s \in S}$ such that
$\mathrm{mass}(\pi_s) = \alpha_s$, and

    $$\sum_{s \in S} \int \|x - s\|^2 \, d\pi_s(x) = \varepsilon^2 = W_2(\mu, \bar\mu)^2.$$

Drawing a random point $X$ from the measure $\mu$ amounts to (i) choosing
a random point $s$ in the set $S$ (with probability $\alpha_s$) and (ii)
drawing a random point $X$ following the distribution $\pi_s$. Given $N$
independent points $X_1, \dots, X_N$ drawn from the measure $\mu$, denote by
$I_{s,N}$ the proportion of the $(X_i)$ for which the point $s$ was selected
in step (i). Hoeffding's inequality allows us to easily quantify how far the
proportion $I_{s,N}$ deviates from $\alpha_s$:
$\mathbb{P}(|I_{s,N} - \alpha_s| \ge \delta) \le \exp(-2N\delta^2)$.
Combining these inequalities for every point $s$ and using the union bound
yields

    $$\mathbb{P}\left( \sum_{s \in S} |I_{s,N} - \alpha_s| \le \delta \right) \ge 1 - n \exp(-2N\delta^2 / n^2).$$
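For every point $s$, denote by $\tilde\pi_s$ the distribution of the
distances to $s$ in the submeasure $\pi_s$, i.e. the measure on the real
line defined by
$\tilde\pi_s(I) := \pi_s(\{x \in \mathbb{R}^d \,;\, \|x - s\| \in I\})$ for
every interval $I$. Define $\tilde\mu$ as the sum of the $\tilde\pi_s$; by
the change of variable formula one has

    $$\int_{\mathbb{R}} t^2 \, d\tilde\mu(t) = \sum_s \int_{\mathbb{R}} t^2 \, d\tilde\pi_s = \sum_s \int_{\mathbb{R}^d} \|x - s\|^2 \, d\pi_s = \varepsilon^2.$$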

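Given a random point $X_i$ sampled from $\mu$, denote by $Y_i$ the Euclidean
distance between the point $X_i$ and the point $s$ chosen in step (i). By
construction, the distribution of $Y_i$ is given by the measure
$\tilde\mu$; using the Hoeffding inequality again one gets

    $$\mathbb{P}\left( \frac{1}{N} \sum_{i=1}^{N} Y_i^2 \le (\varepsilon + \eta)^2 \right) \ge 1 - \exp(-2N\eta^2\varepsilon^2 / D^2).$$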

In order to conclude, we need to define a transport plan from the empirical measure $\mathbf{1}_P = \frac{1}{N} \sum_{i=1}^{N} \delta_{X_i}$ to the finite measure $\bar\mu$. To achieve this, we order the points $(X_i)$ by increasing distance $Y_i$, then transport every Dirac mass $\frac{1}{N}\delta_{X_i}$ to the corresponding point $s$ in $S$ until $s$ is "full", i.e. the mass $\alpha_s$ is reached. The squared cost of this transport operation is at most $\frac{1}{N}\sum_{i=1}^{N} Y_i^2$. Then distribute the remaining mass among the $s$ points in any way; the cost of this step is at most $D$ times $\sum_{s \in S} |I_{s,N} - \alpha_s|$. The total cost of this transport plan is the sum of these two costs. From what we have shown above, setting $\eta = \varepsilon$ and $\delta = \varepsilon/D$, one gets

\[ \mathbb{P}\big(W_2(\mathbf{1}_P, \mu) \le 4\varepsilon\big) \ge 1 - n \exp\!\big(-2N\varepsilon^2 / (Dn)^2\big) - \exp\!\big(-2N\varepsilon^4 / D^2\big). \qquad \square \]
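To get a feel for how the bound of the Convergence Theorem behaves as the sample size grows, the following minimal sketch evaluates its right-hand side. The function name and the clipping at zero are our own choices, and the covering number $\mathcal{N}_\mu(\varepsilon)$ is assumed to be supplied by the caller (for instance through an upper bound such as the one in the Dimension-Complexity Lemma).

```python
import math


def convergence_probability_bound(n_samples, eps, diameter, covering_number):
    """Evaluate 1 - N_mu(eps) * exp(-2 N eps^2 / (D N_mu(eps))^2)
                  - exp(-2 N eps^4 / D^2),
    clipped below at 0 (a negative value carries no information).

    `covering_number` stands for N_mu(eps); it is an input here, not
    something this sketch computes.
    """
    first = covering_number * math.exp(
        -2.0 * n_samples * eps ** 2 / (diameter * covering_number) ** 2
    )
    second = math.exp(-2.0 * n_samples * eps ** 4 / diameter ** 2)
    return max(0.0, 1.0 - first - second)


# Illustrative (made-up) parameters: the bound is vacuous for small N
# and approaches 1 as N grows.
for n in (10, 10_000, 1_000_000):
    print(n, convergence_probability_bound(n, eps=0.1, diameter=1.0,
                                           covering_number=10))
```

The first exponential decays much more slowly than the second when the covering number is large, so in practice it is the covering number that dictates how many samples are needed before the bound becomes informative.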

As a consequence of the Dimension-Complexity Lemma 3 and of the Convergence Theorem 3, any measure $\mu$ satisfying an upper bound on its dimension is well approximated by empirical sampling. A result similar to the Convergence Theorem follows when the samples are drawn not from the original measure $\mu$, but from a "noisy" approximation $\nu$ which need not be compactly supported:

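Corollary 1 (Noisy Convergence). Let $\mu$, $\nu$ be two probability measures on $\mathbb{R}^d$ with $W_2(\mu, \nu) = \sigma$, and $P$ be a set of $N$ points drawn independently from the measure $\nu$. Then,

\[ \mathbb{P}\big(W_2(\mathbf{1}_P, \mu) \le 9\sigma\big) \ge 1 - \mathcal{N}_\mu(\sigma) \exp\!\big(-8N\sigma^2 / (D\,\mathcal{N}_\mu(\sigma))^2\big) - \exp\!\big(-32N\sigma^4 / D^2\big). \]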

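Proof. One only needs to apply the previous Convergence Theorem to the measures $\nu$ and $\mathbf{1}_P$:

\[ \mathbb{P}\big(W_2(\nu, \mathbf{1}_P) \le 4\varepsilon\big) \ge 1 - \mathcal{N}_\nu(\varepsilon) \exp\!\big(-2N\varepsilon^2 / (D\,\mathcal{N}_\nu(\varepsilon))^2\big) - \exp\!\big(-2N\varepsilon^4 / D^2\big). \tag{6} \]

Set $\varepsilon = 2\sigma$ and recall that by definition $\mathcal{N}_\nu(2\sigma) \le \mathcal{N}_\mu(\sigma)$. Then, using $W_2(\mathbf{1}_P, \mu) \le W_2(\mathbf{1}_P, \nu) + \sigma$ one has

\[ \mathbb{P}\big(W_2(\mathbf{1}_P, \mu) \le 9\sigma\big) \ge \mathbb{P}\big(W_2(\mathbf{1}_P, \nu) \le 8\sigma\big). \]

We conclude by using Eq. (6) with $\varepsilon = 2\sigma$. □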
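For the reader's convenience, the substitution $\varepsilon = 2\sigma$ can be spelled out as follows. This is only a worked version of the step above; the monotonicity remark in the last line is our own addition, used to justify replacing $\mathcal{N}_\nu(2\sigma)$ by the larger quantity $\mathcal{N}_\mu(\sigma)$.

```latex
\begin{align*}
4\varepsilon + \sigma &= 8\sigma + \sigma = 9\sigma, \\
2N\varepsilon^2 &= 2N(2\sigma)^2 = 8N\sigma^2, \\
2N\varepsilon^4 &= 2N(2\sigma)^4 = 32N\sigma^4, \\
\frac{d}{dn}\Big( n\, e^{-c/n^2} \Big) &= e^{-c/n^2}\Big(1 + \frac{2c}{n^2}\Big) > 0
\quad \text{for } c \ge 0,\ n > 0.
\end{align*}
```

Since $n \mapsto n \exp(-c/n^2)$ is increasing, the term $\mathcal{N}_\nu(2\sigma) \exp(-8N\sigma^2/(D\,\mathcal{N}_\nu(2\sigma))^2)$ is at most the same expression with $\mathcal{N}_\mu(\sigma)$ in place of $\mathcal{N}_\nu(2\sigma)$, which gives the constants $8$ and $32$ of the corollary.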

It is now possible to combine Theorem 2 (Witnessed Bound), Corollary 1 (Noisy Convergence) and Lemma 3 (Dimension-Complexity) to get the following probabilistic statement.

Theorem 4 (Approximation). Suppose that $\mu$ is a measure satisfying the dimension assumption, supported on a set $K$ of diameter $D$, and $\nu$ a noisy approximation of $\mu$, i.e. $W_2(\mu, \nu) \le \sigma$. Let $P$ be a set of $N$ points independently sampled from $\nu$. Then, the inequality

\[ \|d^{\mathrm{w}}_{P,k} - d_K\|_\infty \le 54\, m_0^{-1/2} \sigma + 24\, m_0^{1/\ell} \alpha_\mu^{-1/\ell} \]

holds with probability at least

\[ 1 - \gamma_\mu \exp\!\big(-\beta_\mu N \max(\sigma^{2+2\ell}, \sigma^4) - \ell \ln(\sigma)\big), \]

where $\beta_\mu = \max\!\big( 8/(D \alpha_\mu 5^\ell)^2,\ 32/D^2 \big)$ and $\gamma_\mu = 1 + \alpha_\mu 5^\ell$.
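As a rough numerical companion to the Approximation Theorem, the sketch below evaluates the probability in its unsimplified form, i.e. the Noisy Convergence bound with the covering number replaced by the Dimension-Complexity bound $\alpha_\mu 5^\ell / \sigma^\ell$. The function name, the parameter names and the clipping at zero are assumptions made for illustration; the parameter values in the example are made up.

```python
import math


def approximation_probability(n_samples, sigma, diameter, alpha, ell):
    """Lower bound on the probability that the inequality of the
    Approximation Theorem holds, computed as
        1 - C * exp(-8 N sigma^2 / (D C)^2) - exp(-32 N sigma^4 / D^2)
    with C = alpha * 5^ell / sigma^ell, the Dimension-Complexity bound
    on the covering number. The result is clipped below at 0.
    """
    cover = alpha * 5.0 ** ell / sigma ** ell
    first = cover * math.exp(
        -8.0 * n_samples * sigma ** 2 / (diameter * cover) ** 2
    )
    second = math.exp(-32.0 * n_samples * sigma ** 4 / diameter ** 2)
    return max(0.0, 1.0 - first - second)


# Made-up parameters: a 1-dimensional measure with noise level 0.5.
for n in (10, 1_000, 10_000):
    print(n, approximation_probability(n, sigma=0.5, diameter=1.0,
                                       alpha=1.0, ell=1))
```

Working with the unsimplified expression avoids committing to particular constants $\beta_\mu$ and $\gamma_\mu$ and makes the dependence on $\sigma^{2+2\ell}$ visible: the covering bound enters squared in the exponent's denominator and contributes the factor $\sigma^{2\ell}$.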

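Proof. Thanks to the Witnessed Bound Theorem and the Noisy Convergence Corollary, the inequality holds with probability at least:

\[ 1 - \mathcal{N}_\mu(\sigma) \exp\!\big(-8N\sigma^2 / (D\,\mathcal{N}_\mu(\sigma))^2\big) - \exp\!\big(-32N\sigma^4 / D^2\big). \]

We use Lemma 3 to bound the covering number $\mathcal{N}_\mu(\sigma)$ from above by $\alpha_\mu 5^\ell / \sigma^\ell$. Hence, the previous expression is bounded from below by

\[ \begin{aligned}
& 1 - \alpha_\mu 5^\ell \exp\!\big(-8N\sigma^{2+2\ell} / (D \alpha_\mu 5^\ell)^2 - \ell \ln(\sigma)\big) - \exp\!\big(-32N\sigma^4 / D^2\big) \\
& \qquad \ge 1 - \gamma_\mu \exp\!\big(-\beta_\mu N \max(\sigma^{2+2\ell}, \sigma^4) - \ell \ln(\sigma)\big),
\end{aligned} \]

where $\gamma_\mu = 1 + \alpha_\mu 5^\ell$ and $\beta_\mu = \max\!\big( 8/(D \alpha_\mu 5^\ell)^2,\ 32/D^2 \big)$, as stated in the theorem. □

6. Discussion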



We illustrate the utility of the bound in the Witnessed Bound Theorem by an example and an inference statement. Figure 1 shows 6000 points drawn from the uniform distribution on a sideways figure-8 (in red), convolved with a Gaussian distribution. The ordinary distance function to the point set has no hope of recovering geometric information out of these points since both loops of the figure-8 are filled in. On the right, we show the sublevel sets of the distance to the uniform measure on the point set, both the witnessed $k$-distance and the exact $k$-distance. Both functions recover the topology of the figure-8; the bits missing from the witnessed $k$-distance smooth out the boundary of the sublevel set, but do not affect the image at large.
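For readers who want to experiment with the two functions compared above, here is a minimal pure-Python sketch. It assumes the definitions from the earlier sections of the paper, which are not restated in this section: the $k$-distance at $x$ is the root mean squared distance from $x$ to its $k$ nearest neighbors in $P$, and the witnessed $k$-distance is the power distance to the barycenters of the $k$ nearest neighbors of each sample point, each weighted by the mean squared distance from the barycenter to those neighbors. All function names are hypothetical, and the brute-force neighbor search is chosen for brevity, not efficiency.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_nearest(points, x, k):
    """The k points of `points` closest to x (brute force)."""
    return sorted(points, key=lambda p: sq_dist(p, x))[:k]


def k_distance(points, x, k):
    """Exact k-distance: root mean squared distance to the k nearest points."""
    return math.sqrt(sum(sq_dist(p, x) for p in k_nearest(points, x, k)) / k)


def witnessed_sites(points, k):
    """One weighted site per sample point p: the barycenter of the k nearest
    neighbors of p, with weight equal to their mean squared distance to it."""
    sites = []
    for p in points:
        nn = k_nearest(points, p, k)
        center = tuple(sum(q[j] for q in nn) / k for j in range(len(p)))
        weight = sum(sq_dist(q, center) for q in nn) / k
        sites.append((center, weight))
    return sites


def witnessed_k_distance(sites, x):
    """Power distance from x to the weighted sites."""
    return math.sqrt(min(sq_dist(c, x) + w for c, w in sites))


# Made-up data: 200 noisy samples of the unit circle, k = 10.
random.seed(0)
angles = [random.uniform(0.0, 2.0 * math.pi) for _ in range(200)]
P = [(math.cos(t) + random.gauss(0.0, 0.05),
      math.sin(t) + random.gauss(0.0, 0.05)) for t in angles]
sites = witnessed_sites(P, 10)
for x in [(0.0, 0.0), (1.0, 0.0), (2.0, 2.0)]:
    print(x, k_distance(P, x, 10), witnessed_k_distance(sites, x))
```

The representation has one weighted site per input point, which is the linear-size property emphasized in the abstract. Because the witnessed sites are a subset of all barycenters of $k$ sample points, the witnessed value is never smaller than the exact one, and the two coincide at the sample points themselves.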

Inference. Suppose that we are in the conditions of the Approximation Theorem, but additionally we assume that the support $K$ of the original measure $\mu$ has a weak feature size larger than $R$. This means that the distance function $d_K$ has no critical value in $(0, R)$, and implies that all the offsets $K^r = d_K^{-1}[0, r]$ of $K$ are homotopy equivalent for $r \in (0, R)$. Suppose again that we have drawn a set $P$ of $N$ points from a Wasserstein approximation $\nu$ of $\mu$, such that $W_2(\mu, \nu) \le \sigma$. From the Approximation Theorem, we have

\[ \|d^{\mathrm{w}}_{P,k} - d_K\|_\infty \le \varepsilon(m_0) := 54\, m_0^{-1/2} \sigma + 24\, m_0^{1/\ell} \alpha_\mu^{-1/\ell} \]

with high probability as $N$ goes to infinity. Then, the standard argument [8] shows that the Betti numbers of the compact set $K$ can be inferred from the function $d^{\mathrm{w}}_{P,k}$, which is defined only from the point sample $P$, as long as $\varepsilon(m_0)$ is less than $R/4$ (see the Appendix). In the language of persistent homology [13], the persistent Betti numbers $\beta^{(\varepsilon(m_0),\, 3\varepsilon(m_0))}$ of the function $d^{\mathrm{w}}_{P,k}$ are equal to the Betti numbers of the set $K$, $\beta(K)$.
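The condition $\varepsilon(m_0) < R/4$ is easy to check numerically once the constants are known. The sketch below is a hypothetical helper, not part of the paper: it evaluates $\varepsilon(m_0) = 54\, m_0^{-1/2} \sigma + 24\, m_0^{1/\ell} \alpha_\mu^{-1/\ell}$ for $m_0 = k/N$ and lists the values of $k$ for which the inference guarantee applies. It assumes $\sigma$, $\alpha_\mu$, $\ell$ and $R$ are known, which is rarely the case in practice.

```python
import math


def inference_error(m0, sigma, alpha, ell):
    """eps(m0) = 54 * m0^(-1/2) * sigma + 24 * m0^(1/ell) * alpha^(-1/ell)."""
    return 54.0 * sigma / math.sqrt(m0) + 24.0 * (m0 / alpha) ** (1.0 / ell)


def admissible_k(n_points, sigma, alpha, ell, feature_size):
    """All k in 1..N such that eps(k / N) < R / 4, where R = feature_size."""
    threshold = feature_size / 4.0
    return [k for k in range(1, n_points + 1)
            if inference_error(k / n_points, sigma, alpha, ell) < threshold]


# Made-up constants: noiseless sample (sigma = 0) of a 1-dimensional measure.
print(admissible_k(n_points=240, sigma=0.0, alpha=1.0, ell=1,
                   feature_size=4.2))
```

The two terms of $\varepsilon(m_0)$ pull in opposite directions: the noise term decreases with $m_0$ while the dimension term increases, so when $\sigma > 0$ the admissible values of $k$, if any, form an interval bounded away from both ends.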



Choice of the mass parameter. This language also suggests a strategy for choosing a mass parameter $m_0$ for the distance to a measure, a question that has not been addressed by the original paper [5]. For every mass parameter $m_0$, the $p$-dimensional persistence diagram $\mathrm{Pers}_p(d_{\mathbf{1}_P, m_0})$ is a set of points $\{(b_i(m_0), d_i(m_0))\}_i$ in the extended plane $(\mathbb{R} \cup \{\infty\})^2$. Each of these points represents a homology class of dimension $p$ in the sublevel sets of $d_{\mathbf{1}_P, m_0}$; $b_i(m_0)$ and $d_i(m_0)$ are the values at which it is born and dies. Since the distance to measure $d_{\mathbf{1}_P, m_0}$ depends continuously on $m_0$, by [8] so do its persistence diagrams. Thus, one can use the algorithm in [9] to track their evolution. Figure 2 illustrates such a construction for the point set in Figure 1 and the witnessed $k$-distance. It displays the evolution of the persistence $(d_i(m_0) - b_i(m_0))$ of each of the 1-dimensional homology classes as $m_0$ varies, thus highlighting the choices of the mass parameter that lead to the presence of the two prominent classes (corresponding to the two loops of the figure-8).

Figure 2. (PL-approximation of the) 1-dimensional persistence vineyard of the witnessed $k$-distance function. Topological features of the space, obscured by noise for low values of $m_0$, stand out as we increase the mass parameter. [Plot labels: $m_0 = k/N$; persistence = 0.35.]
Recovering Betti numbers. Letting $K$ be the support of a measure $\mu$, and $P$ a point sample drawn from a distribution $\nu$ approximating $\mu$, we denote with $K^r$ and $P^r$ the sublevel sets $d_K^{-1}(-\infty, r]$ and $(d^{\mathrm{w}}_{P,k})^{-1}(-\infty, r]$ of the distance to $K$ and the witnessed $k$-distance to the uniform measure on $P$, respectively. With $\varepsilon(m_0) = 54\, m_0^{-1/2} \sigma + 24\, m_0^{1/\ell} \alpha_\mu^{-1/\ell}$, we have the following sequence of inclusions:

\[ K^{0} \subseteq P^{\varepsilon(m_0)} \subseteq K^{2\varepsilon(m_0)} \subseteq P^{3\varepsilon(m_0)} \subseteq K^{4\varepsilon(m_0)}. \]

Assuming $K$ has a weak feature size $R$, and $\varepsilon(m_0) < R/4$, the function $d_K$ has no critical values in the range $(0, R) \supseteq (0, 4\varepsilon(m_0))$, and therefore the rank of the map on homology induced by the inclusion $H(P^{\varepsilon(m_0)}) \to H(P^{3\varepsilon(m_0)})$ is equal to the Betti numbers of the set $K$.
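The inclusions follow from a short argument, spelled out here under the assumption (taken from the Approximation Theorem) that $\|d^{\mathrm{w}}_{P,k} - d_K\|_\infty \le \varepsilon(m_0)$; we abbreviate $\varepsilon = \varepsilon(m_0)$.

```latex
\begin{align*}
x \in K^{r} &\implies d^{\mathrm{w}}_{P,k}(x) \le d_K(x) + \varepsilon \le r + \varepsilon
             \implies x \in P^{r+\varepsilon}, \\
x \in P^{r} &\implies d_K(x) \le d^{\mathrm{w}}_{P,k}(x) + \varepsilon \le r + \varepsilon
             \implies x \in K^{r+\varepsilon}.
\end{align*}
```

Alternating the two implications, starting from $r = 0$ and increasing $r$ by $\varepsilon$ at each step, produces an interleaved chain of sublevel sets of the two functions of the kind used above.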